Abstract. Statistics has moved beyond the frequentist-Bayesian
controversies of the past. Where does this leave our ability to interpret
results? I suggest that a philosophy compatible with statistical practice,
labeled here statistical pragmatism, serves as a foundation for infer- 
ence. Statistical pragmatism is inclusive and emphasizes the assump- 
tions that connect statistical models with observed data. I argue that 
introductory courses often mischaracterize the process of statistical in- 
ference and I propose an alternative "big picture" depiction. 

Key words and phrases: Bayesian, confidence, frequentist, statistical 
education, statistical pragmatism, statistical significance. 



1. INTRODUCTION 

The protracted battle for the foundations of statis- 
tics, joined vociferously by Fisher, Jeffreys, Neyman, 
Savage and many disciples, has been deeply illumi- 
nating, but it has left statistics without a philoso- 
phy that matches contemporary attitudes. Because 
each camp took as its goal exclusive ownership of 
inference, each was doomed to failure. We have all, 
or nearly all, moved past these old debates, yet our 
textbook explanations have not caught up with the 
eclecticism of statistical practice. 

The difficulties go both ways. Bayesians have de- 
nied the utility of confidence and statistical signifi- 
cance, attempting to sweep aside the obvious success 
of these concepts in applied work. Meanwhile, for 
their part, frequentists have ignored the possibility 
of inference about unique events despite their ubiq- 
uitous occurrence throughout science. Furthermore, 
interpretations of posterior probability in terms of
subjective belief, or confidence in terms of long-run 
frequency, give students a limited and sometimes 
confusing view of the nature of statistical inference. 
When used to introduce the expression of uncer- 
tainty based on a random sample, these caricatures 
forfeit an opportunity to articulate a fundamental 
attitude of statistical practice. 

Most modern practitioners have, I think, an open- 
minded view about alternative modes of inference, 
but are acutely aware of theoretical assumptions and 
the many ways they may be mistaken. I would sug- 
gest that it makes more sense to place in the center 
of our logical framework the match or mismatch of 
theoretical assumptions with the real world of data. 
This, it seems to me, is the common ground that 
Bayesian and frequentist statistics share; it is more 
fundamental than either paradigm taken separately; 
and as we strive to foster widespread understanding 
of statistical reasoning, it is more important for be- 
ginning students to appreciate the role of theoret- 
ical assumptions than for them to recite correctly 
the long-run interpretation of confidence intervals. 
With the hope of prodding our discipline to right 
a lingering imbalance, I attempt here to describe the 
dominant contemporary philosophy of statistics. 

2. STATISTICAL PRAGMATISM 

I propose to call this modern philosophy statisti- 
cal pragmatism. I think it is based on the following 
attitudes: 
1. Confidence, statistical significance, and posterior
probability are all valuable inferential tools. 

2. Simple chance situations, where counting argu- 
ments may be based on symmetries that generate 
equally likely outcomes (six faces on a fair die; 52 
cards in a shuffled deck), supply basic intuitions 
about probability. Probability may be built up 
to important but less immediately intuitive situ- 
ations using abstract mathematics, much the way 
real numbers are defined abstractly based on in- 
tuitions coming from fractions. Probability is use- 
fully calibrated in terms of fair bets: another way 
to say the probability of rolling a 3 with a fair 
die is 1/6 is that 5 to 1 odds against rolling a 3 
would be a fair bet. 

3. Long-run frequencies are important mathemati- 
cally, interpretively, and pedagogically. However, 
it is possible to assign probabilities to unique 
events, including rolling a 3 with a fair die or 
having a confidence interval cover the true mean, 
without considering long-run frequency. Long-run 
frequencies may be regarded as consequences of 
the law of large numbers rather than as part of 
the definition of probability or confidence. 

4. Similarly, the subjective interpretation of poste- 
rior probability is important as a way of under- 
standing Bayesian inference, but it is not funda- 
mental to its use: in reporting a 95% posterior 
interval one need not make a statement such as, 
"My personal probability of this interval covering 
the mean is 0.95." 

5. Statistical inferences of all kinds use statistical 
models, which embody theoretical assumptions. 
As illustrated in Figure 1, like scientific models, 
statistical models exist in an abstract framework; 
to distinguish this framework from the real world 
inhabited by data we may call it a "theoretical 
world." Random variables, confidence intervals, 
and posterior probabilities all live in this theo- 
retical world. When we use a statistical model to 
make a statistical inference we implicitly assert 
that the variation exhibited by data is captured 
reasonably well by the statistical model, so that 
the theoretical world corresponds reasonably well 
to the real world. Conclusions are drawn by ap- 
plying a statistical inference technique, which is 
a theoretical construct, to some real data. Fig- 
ure 1 depicts the conclusions as straddling the 
theoretical and real worlds. Statistical inferences
may have implications for the real world of new
observable phenomena, but in scientific contexts,
conclusions most often concern scientific models
(or theories), so that their "real world" implications
(involving new data) are somewhat indirect
(the new data will involve new and different
experiments).

Fig. 1. The big picture of statistical inference.
Statistical procedures are abstractly defined in
terms of mathematics but are used, in conjunction
with scientific models and methods, to explain
observable phenomena. This picture emphasizes the
hypothetical link between variation in data and its
description using statistical models.
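The fair-bet calibration in attitude 2 and the law-of-large-numbers remark in attitude 3 can be illustrated with a minimal sketch. The function names `fair_odds_against` and `running_frequency`, the seed, and the number of simulated rolls are hypothetical choices made only for this illustration.

```python
from fractions import Fraction
import random


def fair_odds_against(p: Fraction) -> Fraction:
    """Odds against an event of probability p, as the ratio (1 - p) / p."""
    return (1 - p) / p


def running_frequency(n_rolls: int, face: int = 3, seed: int = 0) -> float:
    """Relative frequency of `face` in n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return hits / n_rolls


# Probability 1/6 of rolling a 3 corresponds to fair odds of 5 to 1 against.
p_three = Fraction(1, 6)
odds = fair_odds_against(p_three)  # Fraction(5, 1)

# The long-run frequency is a consequence of the probability assignment,
# not its definition: it settles near 1/6 as the number of rolls grows.
freq = running_frequency(60_000)
```

The probability 1/6 is fixed by the symmetry of the die before any rolls are simulated. The simulated frequency only approximates it.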

The statistical models in Figure 1 could involve 
large function spaces or other relatively weak prob- 
abilistic assumptions. Careful consideration of the 
connection between models and data is a core com- 
ponent of both the art of statistical practice and 
the science of statistical methodology. The purpose 
of Figure 1 is to shift the grounds for discussion. 

Note, in particular, that data should not be con- 
fused with random variables. Random variables live 
in the theoretical world. When we say things like, 
"Let us assume the data are normally distributed" 
and we proceed to make a statistical inference, we 
do not need to take these words literally as asserting 
that the data form a random sample. Instead, this 
kind of language is a convenient and familiar short- 
hand for the much weaker assertion that, for our 
specified purposes, the variability of the data is ade- 
quately consistent with variability that would occur 
in a random sample. This linguistic amenity is used 
routinely in both frequentist and Bayesian frame- 
works. Historically, the distinction between data and 
random variables, the match of the model to the 
data, was set aside, to be treated as a separate topic 
apart from the foundations of inference. But once 
the data themselves were considered random vari- 
ables, the frequentist-Bayesian debate moved into 
the theoretical world: it became a debate about the 
best way to reason from random variables to infer- 
ences about parameters. This was consistent with 
developments elsewhere. In other parts of science,
the distinction between quantities to be measured
and their theoretical counterparts within a mathematical
theory can be relegated to a different subject,
a theory of errors. In statistics, we do not
have that luxury, and it seems to me important,
from a pragmatic viewpoint, to bring to center stage
the identification of models with data. The purpose
of doing so is that it provides different interpretations
of both frequentist and Bayesian inference,
interpretations which, I believe, are closer to the
attitude of modern statistical practitioners.

Fig. 2. (A) BARS fits to a pair of peri-stimulus time histograms displaying neural firing rate of a particular neuron under
two alternative experimental conditions. (B) The two BARS fits are overlaid for ease of comparison.

A familiar practical situation where these issues 
arise is binary regression. A classic example comes 
from a psychophysical experiment conducted by 
Hecht, Shlaer and Pirenne (1942), who investigated
the sensitivity of the human visual system by con- 
structing an apparatus that would emit flashes of 
light at very low intensity in a darkened room. Those 
authors presented light of varying intensities repeat- 
edly to several subjects and determined, for each in- 
tensity, the proportion of times each subject would 
respond that he or she had seen a flash of light. 
For each subject the resulting data are repeated binary
observations ("yes" perceived versus "no" did
not perceive) at each of many intensities and, these 
days, the standard statistical tool to analyze such 
data is logistic regression. We might, for instance,
use maximum likelihood to find a 95% confidence interval
for the intensity of light at which the subject
would report perception with probability p = 0.5.
Because the data reported by Hecht et al. involved 
fairly large samples, we would obtain essentially the 
same answer if instead we applied Bayesian methods 
to get an interval having 95% posterior probability. 
But how should such an interval be interpreted? 
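As a rough illustration of the calculation just described, the following sketch fits a logistic regression by maximum likelihood and forms an approximate 95% interval for the intensity at which p = 0.5. Everything specific here is an assumption made for illustration: the data are synthetic (not those of Hecht et al.), the coefficients `true_b0` and `true_b1` are invented, and the interval uses a delta-method standard error, which is only one of several reasonable constructions.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def fit_logistic(xs, ys, n_iter=25):
    """Newton-Raphson maximum likelihood for P(y=1 | x) = sigmoid(b0 + b1*x).

    Returns (b0, b1) and the 2x2 inverse Fisher information from the final
    iteration; at convergence the last step is negligible, so this matrix is
    evaluated essentially at the maximum-likelihood estimate.
    """
    b0 = b1 = 0.0
    cov = None
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # Fisher information entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        cov = ((h11 / det, -h01 / det), (-h01 / det, h00 / det))
        b0 += cov[0][0] * g0 + cov[0][1] * g1
        b1 += cov[1][0] * g0 + cov[1][1] * g1
    return (b0, b1), cov


def x50_interval(b0, b1, cov, z=1.96):
    """Point estimate and delta-method interval for x50 = -b0/b1 (where p = 0.5)."""
    x50 = -b0 / b1
    d0, d1 = -1.0 / b1, b0 / (b1 * b1)  # gradient of -b0/b1
    var = d0 * d0 * cov[0][0] + 2 * d0 * d1 * cov[0][1] + d1 * d1 * cov[1][1]
    se = math.sqrt(var)
    return x50, (x50 - z * se, x50 + z * se)


# Synthetic experiment (hypothetical values): 9 intensity levels, 100 trials each.
rng = random.Random(0)
true_b0, true_b1 = -4.0, 2.0                  # implies a true x50 of 2.0
levels = [1.0 + 0.25 * k for k in range(9)]   # intensities 1.0, 1.25, ..., 3.0
xs = [x for x in levels for _ in range(100)]
ys = [1.0 if rng.random() < sigmoid(true_b0 + true_b1 * x) else 0.0 for x in xs]

(b0_hat, b1_hat), cov_hat = fit_logistic(xs, ys)
x50_hat, (lo, hi) = x50_interval(b0_hat, b1_hat, cov_hat)
```

The code produces an interval without difficulty. What the interval means once it is attached to real responses is the question taken up in the text.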

A more recent example comes from DiMatteo, Gen- 
ovese and Kass (2001), who illustrated a new non- 
parametric regression method called Bayesian adap- 
tive regression splines (BARS) by analyzing neu- 
ral firing rate data from inferotemporal cortex of 
a macaque monkey. The data came from a study ul- 
timately reported by Rollenhagen and Olson (2005), 
which investigated the differential response of indi- 
vidual neurons under two experimental conditions. 
Figure 2 displays BARS fits under the two condi- 
tions. One way to quantify the discrepancy between 
the fits is to estimate the drop in firing rate from 
peak (the maximal firing rate) to the trough immediately
following the peak in each condition. Let
us call these peak minus trough differences, under
the two conditions, $\phi^1$ and $\phi^2$. Using BARS, DiMatteo,
Genovese and Kass reported a posterior mean
of $\phi^1 - \phi^2 = 50.0$ with posterior standard deviation
(±20.8). In follow-up work, Wallstrom, Liebner and
Kass (2008) reported very good frequentist cover- 
age probability of 95% posterior probability inter- 
vals based on BARS for similar quantities under 
simulation conditions chosen to mimic such experimental
data. Thus, a BARS-based posterior interval
could be considered from either a Bayesian or fre- 
quentist point of view. Again we may ask how such 
an inferential interval should be interpreted. 

3. INTERPRETATIONS 

Statistical pragmatism involves mildly altered in- 
terpretations of frequentist and Bayesian inference. 
For definiteness I will discuss the paradigm case of 
confidence and posterior intervals for a normal mean 
based on a sample of size n, with the standard de- 
viation being known. Suppose that we have n = 49 
observations that have a sample mean equal to 10.2. 

Frequentist assumptions. Suppose $X_1, X_2, \ldots, X_n$
are i.i.d. random variables from a normal
distribution with mean $\mu$ and standard deviation
$\sigma = 1$. In other words, suppose $X_1, X_2, \ldots, X_n$ form
a random sample from a $N(\mu, 1)$ distribution.

Noting that $\bar{x} = 10.2$ and $\sqrt{49} = 7$ we define the
inferential interval

$I = (10.2 - \frac{2}{7},\ 10.2 + \frac{2}{7})$.

The interval $I$ may be regarded as a 95% confidence
interval. I now contrast the standard frequentist
interpretation with the pragmatic interpretation.
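For concreteness, the arithmetic behind the interval can be sketched as follows. The helper name `known_sigma_interval` is hypothetical, and the multiplier z = 2 is the rounded value implied by the half-width 2/7 in the text (the more exact 1.96 would give a slightly narrower interval).

```python
import math


def known_sigma_interval(xbar, n, sigma=1.0, z=2.0):
    """Interval xbar +/- z * sigma / sqrt(n) for a normal mean with known sigma."""
    half_width = z * sigma / math.sqrt(n)
    return (xbar - half_width, xbar + half_width)


# n = 49 observations with sample mean 10.2, so the half-width is 2/7.
interval = known_sigma_interval(10.2, 49)
```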

Frequentist interpretation of confidence
interval. Under the assumptions above, if we were
to draw infinitely many random samples from a
$N(\mu, 1)$ distribution, 95% of the corresponding
confidence intervals $(\bar{X} - \frac{2}{7},\ \bar{X} + \frac{2}{7})$ would cover $\mu$.

Pragmatic interpretation of confidence
interval. If we were to draw a random sample
according to the assumptions above, the resulting
confidence interval $(\bar{X} - \frac{2}{7},\ \bar{X} + \frac{2}{7})$ would have
probability 0.95 of covering $\mu$. Because the random
sample lives in the theoretical world, this is a theoretical
statement. Nonetheless, substituting

(1) $\bar{X} = \bar{x}$

together with

(2) $\bar{x} = 10.2$

we obtain the interval $I$, and are able to draw useful
conclusions as long as our theoretical world is
aligned well with the real world that produced the
data.
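Both interpretations above are statements about random variables, so they can be checked entirely within the theoretical world by simulation. The sketch below is a minimal illustration: the function name `coverage`, the value chosen for the theoretical mean, the seed, and the number of simulated samples are all assumptions. Because the half-width 2/7 corresponds to a multiplier of 2 rather than 1.96, the simulated coverage should come out slightly above 0.95.

```python
import math
import random


def coverage(mu, n=49, sigma=1.0, z=2.0, n_samples=20_000, seed=1):
    """Fraction of simulated intervals Xbar +/- z*sigma/sqrt(n) that cover mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    covered = 0
    for _ in range(n_samples):
        # Draw one random sample of size n and compute its sample mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            covered += 1
    return covered / n_samples


# The theoretical mean is arbitrary here; coverage does not depend on it.
estimated_coverage = coverage(mu=10.0)
```

Nothing in this calculation involves observed data. The simulation confirms a property of the model, and whether that property transfers to an observed mean of 10.2 depends on how well the model describes the variation in the data.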



The main point here is that we do not need a long- 
run interpretation of probability, but we do have 
to be reminded that the unique-event probability 
of 0.95 remains a theoretical statement because it 
applies to random variables rather than data. Let 
us turn to the Bayesian case. 

Bayesian assumptions. Suppose $X_1, X_2, \ldots, X_n$
form a random sample from a $N(\mu, 1)$ distribution
and the prior distribution of $\mu$ is $N(\mu_0, \tau^2)$,
with $\tau^2 \gg \frac{1}{49}$ and $49\tau^2 \gg |\mu_0|$.

The posterior distribution of $\mu$ is normal, the posterior
mean becomes

$\bar{\mu} = \frac{\tau^2}{\frac{1}{49} + \tau^2}\, 10.2 + \frac{\frac{1}{49}}{\frac{1}{49} + \tau^2}\, \mu_0$

and the posterior variance is

$v = \left(49 + \frac{1}{\tau^2}\right)^{-1}$

but because $\tau^2 \gg \frac{1}{49}$ and $49\tau^2 \gg |\mu_0|$ we have

$\bar{\mu} \approx 10.2$

and

$v \approx \frac{1}{49}$.

Therefore, the inferential interval $I$ defined above
has posterior probability 0.95.
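The approximation can be checked numerically with a short sketch. The prior values `mu0 = 5.0` and `tau2 = 100.0` are hypothetical, chosen only so that the two conditions on the prior hold, and the helper names are likewise illustrative.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal(mean, sd^2) distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def posterior(xbar, n, mu0, tau2, sigma2=1.0):
    """Posterior mean and variance of mu for a normal sample with a normal prior."""
    precision = n / sigma2 + 1.0 / tau2           # posterior precision
    v = 1.0 / precision
    mean = v * (n * xbar / sigma2 + mu0 / tau2)   # precision-weighted average
    return mean, v


# Hypothetical diffuse prior: tau2 is much larger than 1/49, and 49*tau2
# is much larger than |mu0|.
post_mean, post_var = posterior(xbar=10.2, n=49, mu0=5.0, tau2=100.0)

# Posterior probability of the interval I = (10.2 - 2/7, 10.2 + 2/7).
lower, upper = 10.2 - 2.0 / 7.0, 10.2 + 2.0 / 7.0
post_sd = math.sqrt(post_var)
prob_I = normal_cdf(upper, post_mean, post_sd) - normal_cdf(lower, post_mean, post_sd)
```

With these prior values the posterior mean is close to 10.2 and the posterior variance is close to 1/49. The probability of $I$ comes out marginally above 0.95, because the half-width 2/7 uses a multiplier of 2 rather than 1.96.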

Bayesian interpretation of posterior interval.
Under the assumptions above, the probability
that $\mu$ is in the interval $I$ is 0.95.

Pragmatic interpretation of posterior interval.
If the data were a random sample for
which (2) holds, that is, $\bar{x} = 10.2$, and if the assumptions
above were to hold, then the probability
that $\mu$ is in the interval $I$ would be 0.95. This refers
to a hypothetical value $\bar{x}$ of the random variable
$\bar{X}$, and because $\bar{X}$ lives in the theoretical world the
statement remains theoretical. Nonetheless, we are
able to draw useful conclusions from the data as long
as our theoretical world is aligned well with the real
world that produced the data.

Here, although the Bayesian approach escapes the 
indirectness of confidence within the theoretical 
world, it cannot escape it in the world of data anal- 
ysis because there remains the additional layer of 
identifying data with random variables. According 
to the pragmatic interpretation, the posterior is not, 
literally, a statement about the way the observed 
data relate to the unknown parameter $\mu$ because
those objects live in different worlds. The language 
of Bayesian inference, like the language of frequen- 
tist inference, takes a convenient shortcut by blur- 
ring the distinction between data and random vari- 
ables. 

The commonality between frequentist and Baye- 
sian inferences is the use of theoretical assumptions, 
together with a subjunctive statement. In both ap- 
proaches a statistical model is introduced — in the 
Bayesian case the prior distributions become part of 
what I am here calling the model — and we may say 
that the inference is based on what would happen 
if the data were to be random variables distributed 
according to the statistical model. This modeling as- 
sumption would be reasonable if the model were to 
describe accurately the variation in the data. 

4. IMPLICATIONS FOR TEACHING 

It is important for students in introductory statis- 
tics courses to see the subject as a coherent, princi- 
pled whole. Instructors, and textbook authors, may 
try to help by providing some notion of a "big pic- 
ture." Often this is done literally, with an illustra- 
tion such as Figure 3 (e.g., Lovett, Meyer and Thille, 
2008). This kind of illustration can be extremely use- 
ful if referenced repeatedly throughout a course. 

Figure 3 represents a standard story about statis- 
tical inference. Fisher introduced the idea of a ran- 
dom sample drawn from a hypothetical infinite pop- 
ulation, and Neyman and Pearson's work encour- 
aged subsequent mathematical statisticians to drop 
the word "hypothetical" and instead describe statis- 
tical inference as analogous to simple random sam- 
pling from a finite population. This is the concept 
that Figure 3 tries to get across. My complaint is
that it is not a good general description of statistical
inference, and my claim is that Figure 1 is more accurate.
For instance, in the psychophysical example
of Hecht, Shlaer and Pirenne discussed in Section
2, there is no population of "yes" or "no" replies
from which a random sample is drawn. We do not
need to struggle to make an analogy with a simple
random sample. Furthermore, any thoughts along
these lines may draw attention away from the most
important theoretical assumptions, such as independence
among the responses. Figure 1 is supposed to
remind students to look for the important assumptions,
and ask whether they describe the variation
in the data reasonably accurately.

Fig. 3. The big picture of statistical inference according to
the standard conception. Here, a random sample is pictured
as a sample from a finite population.

One of the reasons the population and sample pic- 
ture in Figure 3 is so attractive pedagogically is that 
it reinforces the fundamental distinction between 
parameters and statistics through the terms popula- 
tion mean and sample mean. To my way of thinking, 
this terminology, inherited from Fisher, is unfortu- 
nate. Instead of "population mean" I would much 
prefer theoretical mean, because it captures better 
the notion that a theoretical distribution is being 
introduced, a notion that is reinforced by Figure 1. 

I have found Figure 1 helpful in teaching basic 
statistics. For instance, when talking about random 
variables I like to begin with a set of data, where 
variation is displayed in a histogram, and then say 
that probability may be used to describe such vari- 
ation. I then tell the students we must introduce 
mathematical objects called random variables, and 
in defining them and applying the concept to the 
data at hand, I immediately acknowledge that this 
is an abstraction, while also stating that — as the stu- 
dents will see repeatedly in many examples — it can 
be an extraordinarily useful abstraction whenever 
the theoretical world of random variables is aligned 
well with the real world of the data. 

I have also used Figure 1 in my classes when de- 
scribing attitudes toward data analysis that statisti- 
cal training aims to instill. Specifically, I define sta- 
tistical thinking, as in the article by Brown and Kass 
(2009), to involve two principles: 

1. Statistical models of regularity and variability in 
data may be used to express knowledge and un- 
certainty about a signal in the presence of noise, 
via inductive reasoning. 

2. Statistical methods may be analyzed to deter- 
mine how well they are likely to perform. 

Principle 1 identifies the source of statistical inference
to be the hypothesized link between data and
statistical models. In explaining, I explicitly distinguish
the use of probability to describe variation
and to express knowledge. A probabilistic description
of variation would be "The probability of rolling
a 3 with a fair die is 1/6" while an expression of
knowledge would be "I'm 90% sure the capital of
Wyoming is Cheyenne." These two sorts of statements,
which use probability in different ways, are
sometimes considered to involve two different kinds
of probability, which have been called "aleatory
probability" and "epistemic probability." Bayesians
merge these, applying the laws of probability to go from
quantitative description to quantified belief, but in
every form of statistical inference aleatory probability
is used, somehow, to make epistemic statements.
This is Principle 1. Principle 2 is that the same sorts
of statistical models may be used to evaluate statistical
procedures, though in the classroom I also
explain that performance of procedures is usually
investigated under varying circumstances.

Fig. 4. A more elaborate big picture, reflecting in greater detail the process of statistical inference. As in Figure 1, there is
a hypothetical link between data and statistical models but here the data are connected more specifically to their representation
as random variables.

For somewhat more advanced audiences it is pos- 
sible to elaborate, describing in more detail the pro- 
cess trained statisticians follow when reasoning from 
data. A big picture of the overall process is given 
in Figure 4. That figure indicates the hypothetical 
connection between data and random variables, be- 
tween key features of unobserved mechanisms and 
parameters, and between real-world and theoretical 
conclusions. It further indicates that data display 
both regularity (which is often described in theoretical
terms as a "signal," sometimes conforming
to simple mathematical descriptions or "laws") and
unexplained variability, which is usually taken to be
"noise." The figure also includes the components
exploratory data analysis (EDA) and algorithms,
but the main message of Figure 4, given by the labels
of the two big boxes, is the same as that in
Figure 1.

5. DISCUSSION 

According to my understanding, laid out above, 
statistical pragmatism has two main features: it is 
eclectic and it emphasizes the assumptions that con- 
nect statistical models with observed data. The prag- 
matic view acknowledges that both sides of the fre- 
quentist-Bayesian debate made important points. 
Bayesians scoffed at the artificiality in using sam- 
pling from a finite population to motivate all of 
inference, and in using long-run behavior to define 
characteristics of procedures. Within the theoretical 
world, posterior probabilities are more direct, and 
therefore seemed to offer much stronger inferences. 
Frequentists bristled, pointing to the subjectivity of 
prior distributions. Bayesians responded by treating
subjectivity as a virtue on the grounds that all inferences
are subjective. Yet, while there is a kernel
of truth in this observation (we are all human beings,
making our own judgments), subjectivism was
never satisfying as a logical framework: an important
purpose of the scientific enterprise is to go beyond
personal decision-making. Nonetheless, from
a pragmatic perspective, while the selection of prior 
probabilities is important, their use is not so problematic
as to disqualify Bayesian methods, and in
looking back on history the introduction of prior 
distributions may not have been the central bother- 
some issue it was made out to be. Instead, it seems 
to me, the really troubling point for frequentists 
has been the Bayesian claim to a philosophical high 
ground, where compelling inferences could be de- 
livered at negligible logical cost. Frequentists have 
always felt that no such thing should be possible. 
The difficulty begins not with the introduction of 
prior distributions but with the gap between models 
and data, which is neither frequentist nor Bayesian. 
Statistical pragmatism avoids this irritation by ac- 
knowledging explicitly the tenuous connection be- 
tween the real and theoretical worlds. As a result, 
its inferences are necessarily subjunctive. We speak 
of what would be inferred if our assumptions were 
to hold. The inferential bridge is traversed, by both 
frequentist and Bayesian methods, when we act as if 
the data were generated by random variables. In the 
normal mean example discussed in Section 3, the key
step involves the conjunction of the two equations 
(1) and (2). Strictly speaking, according to statisti- 
cal pragmatism, equation (1) lives in the theoretical 
world while equation (2) lives in the real world; the 
bridge is built by allowing $\bar{x}$ to refer to both the
theoretical value of the random variable and the ob- 
served data value. 

In pondering the nature of statistical inference I 
am, like others, guided partly by past and present 
sages (for an overview see Barnett, 1999), but also 
by my own experience and by watching many col- 
leagues in action. Many of the sharpest and most 
vicious Bayes-frequentist debates took place during 
the dominance of pure theory in academia. Statis- 
ticians are now more inclined to argue about the 
extent to which a method succeeds in solving a data 
analytic problem. Much statistical practice revolves 
around getting good estimates and standard errors 
in complicated settings where statistical uncertainty 
is smaller than the unquantified aggregate of many 
other uncertainties in scientific investigation. In such 
contexts, the distinction between frequentist and 
Bayesian logic becomes unimportant and contemporary
practitioners move freely between frequentist
and Bayesian techniques using one or the other depending
on the problem. Thus, in a review of statistical
methods in neurophysiology in which my colleagues
and I discussed both frequentist and Bayesian
methods (Kass, Ventura and Brown, 2005), not
only did we not emphasize this dichotomy but we
did not even mention the distinction between the
approaches or their inferential interpretations.

In fact, in my first publication involving analysis 
of neural data (Olson et al., 2001) we reported more 
than a dozen different statistical analyses, some fre- 
quentist, some Bayesian. Furthermore, methods from 
the two approaches are sometimes glued together in 
a single analysis. For example, to examine several
neural firing-rate intensity functions $\lambda^1(t), \ldots, \lambda^p(t)$,
assumed to be smooth functions of time $t$, Behseta et
al. (2007) developed a frequentist approach to testing
the hypothesis $H_0 : \lambda^1(t) = \cdots = \lambda^p(t)$, for all $t$,
that incorporated BARS smoothing. Such hybrids
are not uncommon, and they do not force a prac- 
titioner to walk around with mutually inconsistent 
interpretations of statistical inference. Figure 1 pro- 
vides a general framework that encompasses both 
of the major approaches to methodology while em- 
phasizing the inherent gap between data and mod- 
eling assumptions, a gap that is bridged through 
subjunctive statements. The advantage of the prag- 
matic framework is that it considers frequentist and 
Bayesian inference to be equally respectable and al- 
lows us to have a consistent interpretation, without 
feeling as if we must have split personalities in or- 
der to be competent statisticians. More to the point, 
this framework seems to me to resemble more closely 
what we do in practice: statisticians offer inferences 
couched in a cautionary attitude. Perhaps we might 
even say that most practitioners are subjunctivists. 

I have emphasized subjunctive statements partly 
because, on the frequentist side, they eliminate any 
need for long-run interpretation. For Bayesian meth- 
ods they eliminate reliance on subjectivism. The 
Bayesian point of view was articulated admirably 
by Jeffreys (see Robert, Chopin and Rousseau, 2009, 
and accompanying discussion) but it became clear, 
especially from the arguments of Savage and sub- 
sequent investigations in the 1970s, that the only 
solid foundation for Bayesianism is subjective (see 
Kass and Wasserman, 1996, and Kass, 2006). Sta- 
tistical pragmatism pulls us out of that solipsistic 
quagmire. On the other hand, I do not mean to im- 
ply that it really does not matter what approach 
is taken in a particular instance. Current attention 
frequently focuses on challenging, high-dimensional 
datasets where frequentist and Bayesian methods 
may differ. Statistical pragmatism is agnostic on 
this. Instead, procedures should be judged according 
to their performance under theoretical conditions 
thought to capture relevant real-world variation in
a particular applied setting. This is where our juxta- 
position of the theoretical world with the real world 
earns its keep. 

I called the story about statistical inference told 
by Figure 3 "standard" because it is imbedded in 
many introductory texts, such as the path-breaking 
book by Freedman, Pisani and Purves (2007) and 
the excellent and very popular book by Moore and 
McCabe (2005). My criticism is that the standard 
story misrepresents the way statistical inference is 
commonly understood by trained statisticians, por- 
traying it as analogous to simple random sampling 
from a finite population. As I noted, the population 
versus sampling terminology comes from Fisher, but 
I believe the conception in Figure 1 is closer to
Fisher's conception of the relationship between theory
and data. Fisher spoke pointedly of a hypothetical 
infinite population, but in the standard story of Fig- 
ure 3 the "hypothetical" part of this notion — which 
is crucial to the concept — gets dropped (confer also 
Lenhard, 2006). I understand Fisher's "hypotheti- 
cal" to connote what I have here called "theoreti- 
cal." Fisher did not anticipate the co-option of his 
framework and was, in large part for this reason, 
horrified by subsequent developments by Neyman 
and Pearson. The terminology "theoretical" avoids 
this confusion and thus may offer a clearer representation
of Fisher's idea.¹

We now recognize Neyman and Pearson to have 
made permanent, important contributions to sta- 
tistical inference through their introduction of hy- 
pothesis testing and confidence. From today's van- 
tage point, however, their behavioral interpretation 
seems quaint, especially when represented by their 
famous dictum, "We are inclined to think that as 
far as a particular hypothesis is concerned, no test 
based upon the theory of probability can by itself 
provide any valuable evidence of the truth or false- 
hood of that hypothesis." Nonetheless, that inter- 
pretation seems to have inspired the attitude be- 
hind Figure 3. In the extreme, one may be led to 
insist that statistical inferences are valid only when 
some chance mechanism has generated the data. The 
problem with the chance-mechanism conception is 
that it applies to a rather small part of the real 
world, where there is either actual random sampling
or situations described by statistical or quantum
physics. I believe the chance-mechanism conception
errs in declaring that data are assumed to be random
variables, rather than allowing the gap of Figure 1
to be bridged² by statements such as (2). In saying
this I am trying to listen carefully to the voice
in my head that comes from the late David Freedman
(see Freedman and Zeisel, 1988). I imagine he
might call crossing this bridge, in the absence of an
explicit chance mechanism, a leap of faith. In a strict
sense I am inclined to agree. It seems to me, however,
that it is precisely this leap of faith that makes
statistical reasoning possible in the vast majority of
applications.

¹ Fisher also introduced populations partly because he used
long-run frequency as a foundation for probability, which
statistical pragmatism considers unnecessary.

Statistical models that go beyond chance mecha- 
nisms have been central to statistical inference since 
Fisher and Jeffreys, and their role in reasoning has 
been considered by many authors (e.g., Cox, 1990; 
Lehmann, 1990). An outstanding issue is the ex- 
tent to which statistical models are like the theo- 
retical models used throughout science (see Stan- 
ford, 2006). I would argue, on the one hand, that 
they are similar: the most fundamental belief of any 
scientist is that the theoretical and real worlds are 
aligned. On the other hand, as observed in Section 
2, statistics is unique in having to face the gap be- 
tween theoretical and real worlds every time a model 
is applied and, it seems to me, this is a big part of 
what we offer our scientific collaborators. Statisti- 
cal pragmatism recognizes that all forms of statisti- 
cal inference make assumptions, assumptions which 
can only be tested very crudely (with such things as 
goodness-of-fit methods) and can almost never be 
verified. This is not only at the heart of statistical 
inference, it is also the great wisdom of our field. 

ACKNOWLEDGMENTS 

This work was supported in part by NIH Grant 
MH064537. The author is grateful for comments on 
an earlier draft by Brian Junker, Nancy Reid, Steven 
Stigler, Larry Wasserman and Gordon Weinberg. 

REFERENCES 

Barnett, V. (1999). Comparative Statistical Inference, 3rd 
ed. Wiley, New York. MR0663189 

² Because probability is introduced with the goal of drawing
conclusions via statistical inference, it is, in a philosophical
sense, "instrumental." See Glymour (2001).






Behseta, S., Kass, R. E., Moorman, D. and Olson, C. R.
(2007). Testing equality of several functions: Analysis of
single-unit firing rate curves across multiple experimental
conditions. Statist. Med. 26 3958-3975. MR2395881

Brown, E. N. and Kass, R. E. (2009). What is statistics?
(with discussion). Amer. Statist. 63 105-123.

Cox, D. R. (1990). Role of models in statistical analysis.
Statist. Sci. 5 169-174. MR1062575

DiMatteo, I., Genovese, C. R. and Kass, R. E. (2001).
Bayesian curve-fitting with free-knot splines. Biometrika
88 1055-1071. MR1872219

Freedman, D., Pisani, R. and Purves, R. (2007). Statistics,
4th ed. W. W. Norton, New York.

Freedman, D. and Zeisel, H. (1988). From mouse-to-man: The
quantitative assessment of cancer risks (with discussion).
Statist. Sci. 3 3-56.

Glymour, C. (2001). Instrumental probability. Monist 84
284-300.

Hecht, S., Shlaer, S. and Pirenne, M. H. (1942). Energy,
quanta and vision. J. Gen. Physiol. 25 819-840.

Kass, R. E. (2006). Kinds of Bayesians (comment on articles
by Berger and by Goldstein). Bayesian Anal. 1 437-440. 
MR2221277 

Kass, R. E., Ventura, V. and Brown, E. N. (2005). Sta- 
tistical issues in the analysis of neuronal data. J. Neuro- 
physiol. 94 8-25. 

Kass, R. E. and Wasserman, L. A. (1996). The selection of 
prior distributions by formal rules. J. Amer. Statist. Assoc. 
91 1343-1370. MR1478684 



Lehmann, E. L. (1990). Model specification: The views of 
Fisher and Neyman, and later developments. Statist. Sci. 
5 160-168. MR1062574 

Lenhard, J. (2006). Models and statistical inference: The 
controversy between Fisher and Neyman-Pearson. British 
J. Philos. Sci. 57 69-91. MR2209772 

Lovett, M., Meyer, O. and Thille, C. (2008). The open 
learning initiative: Measuring the effectiveness of the OLI 
statistics course in accelerating student learning. J. Inter- 
act. Media Educ. 14. 

Moore, D. S. and McCabe, G. (2005). Introduction to the 
Practice of Statistics, 5th ed. W. H. Freeman, New York. 

Olson, C. R., Gettner, S. N., Ventura, V., Carta, R. 
and Kass, R. E. (2001). Neuronal activity in macaque 
supplementary eye field during planning of saccades in re- 
sponse to pattern and spatial cues. J. Neurophysiol. 84 
1369-1384. 

Robert, C. P., Chopin, N. and Rousseau, J. (2009). 
Harold Jeffreys' theory of probability revisited (with dis- 
cussion). Statist. Sci. 24 141-194. MR2655841 

Rollenhagen, J. E. and Olson, C. R. (2005). Low- 
frequency oscillations arising from competitive interactions 
between visual stimuli in macaque inferotemporal cortex. 
J. Neurophysiol. 94 3368-3387. 

Stanford, P. K. (2006). Exceeding Our Grasp. Oxford Univ. 
Press. 

Wallstrom, G., Liebner, J. and Kass, R. E. (2008). 
An implementation of Bayesian adaptive regression splines 
(BARS) in C with S and R wrappers. J. Statist. Software 
26 1-21. 



